@cometkim commented Jul 30, 2025

Can Claude Code optimize? Let's see

Claude suggested several improvements that wouldn't break the test.

Verifying the claims

While the claims sound reasonable, theory and reality can differ, so I broke down the suggestions and verified them separately.

Claim 1. Binary search operation

Problem: Used signed right shift (>>) in binary search
Solution: Changed to unsigned right shift (>>>) for better performance
Impact: Minor but consistent improvement in Unicode range lookups

Experimented in #79

Result: False (but helpful suggestion)

It made no difference, even though this is the most frequently hit code path.

>>> and >> shouldn't differ here; the codebase already handles its integers carefully, so the operands never go negative. However, I accepted the change because it makes the code a bit more compact.
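
For reference, the lookup in question is roughly shaped like the sketch below. This is a hypothetical illustration of a sorted-range binary search, not the library's actual table code; it only shows why the two shift operators behave identically when the operands stay small and non-negative.

// Hypothetical sketch of a sorted-range lookup (not the actual library code).
// `ranges` is a flat array of [start, end) code point pairs, sorted by start.
function inRanges(ranges, code) {
  let lo = 0;
  let hi = (ranges.length >>> 1) - 1;
  while (lo <= hi) {
    // `(lo + hi) >>> 1` and `(lo + hi) >> 1` produce the same midpoint here,
    // because lo + hi is always a small non-negative integer for Unicode tables.
    let mid = (lo + hi) >>> 1;
    if (code < ranges[2 * mid]) {
      hi = mid - 1;
    } else if (code >= ranges[2 * mid + 1]) {
      lo = mid + 1;
    } else {
      return true;
    }
  }
  return false;
}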

Claim 2. String build optimization

Problem: Character-by-character string concatenation using segment += input[cursor++]
Solution: Replace with input.slice(segmentStart, cursor) to avoid repeated string allocations
Impact: Significant reduction in memory allocations and GC pressure

Experimented in #80

Result: True

The suggestion makes perfect sense, but not in every case.

I've confirmed that string concatenation does indeed incur overhead, but it's only a problem when the segment is longer than a single character. In the typical case (the Latin alphabet), it's likely to be a single character.

// Case 1: BMP, a single code unit, so no string concatenation at all.
segment = input[cursor];

// Case 2: >= SMP, the segment grows by repeated concatenation.
segment += input[cursor++];

However, the larger the segment, the larger the performance improvement. In extreme cases, such as demonic (Zalgo-style) characters that stack many combining marks, the gain is over 100%.
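
To make the comparison concrete, here is a hypothetical sketch of the two strategies; the helper names are illustrative and not the segmenter's actual loop.

// Concatenation: one intermediate string allocation per appended code unit.
function buildByConcat(input, start, end) {
  let segment = '';
  for (let cursor = start; cursor < end; cursor++) {
    segment += input[cursor];
  }
  return segment;
}

// Slicing: a single allocation for the whole segment, regardless of its length.
function buildBySlice(input, start, end) {
  return input.slice(start, end);
}

For single-code-unit segments the two are effectively the same, which is why the typical alphabetic case sees little benefit.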

Claim 3. Inlining

Problem: All characters went through generic cat() function with binary search
Solution: Inline category detection for ASCII characters (< 127) directly in the main loop
Impact: ~90% of characters in typical text get faster processing

Experimented in #81

Result: False

The suggestion didn't really help.

It only bloats the code; there was no measurable impact on performance.
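
For context, the suggested fast path looked roughly like the sketch below. The function and category names are hypothetical stand-ins for the library's internals, not its real API.

// Hypothetical sketch of the suggested ASCII fast path (not the real cat()).
function category(code) {
  if (code < 0x7f) {
    // Inlined classification for ASCII, skipping the binary search entirely.
    if (code === 0x0d) return 'CR';
    if (code === 0x0a) return 'LF';
    if (code < 0x20) return 'Control';
    return 'Other';
  }
  // Everything else still goes through the generic range lookup.
  return categoryFromRanges(code);
}

// Stand-in for the generic binary search over the Unicode range tables.
function categoryFromRanges(code) {
  return 'Other';
}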

Claim 4. Prioritizing common cases

Problem: Boundary rules were checked in specification order, not frequency order
Solution: Reordered isBoundary() checks to handle most common cases first:

  • GB9/GB9a (extend rules) moved to top as they're the most frequent "no break" cases
  • GB3 (CR x LF) must come before GB4/GB5 to handle correctly
  • Simplified Hangul rules for better performance

Impact: Faster short-circuiting for common character sequences

Experimented in #82

Result: True? (not 100% sure)

The claim is valid. Prioritizing the most common cases is a legitimate strategy. But assuming you already know which case is the most common can be inaccurate, or even dangerous.

The suggested change made the code less intuitive and had no impact on performance. However, it gave a hint for eliminating some potentially unnecessary branches.
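
As a rough illustration, the reordered check looks something like the sketch below. The rule names follow UAX #29 (GB3, GB4/GB5, GB9/GB9a), but the code is hypothetical and not the library's actual isBoundary() implementation.

// Hypothetical sketch of the reordered boundary check (not the real code).
function isBoundary(prev, next) {
  // GB9/GB9a hoisted to the top: Extend, ZWJ, and SpacingMark are the most
  // frequent "do not break" cases in typical text. The extra guard on `prev`
  // keeps GB4 (break after Control/CR/LF) intact after the reordering.
  if (
    (next === 'Extend' || next === 'ZWJ' || next === 'SpacingMark') &&
    prev !== 'Control' && prev !== 'CR' && prev !== 'LF'
  ) {
    return false;
  }
  // GB3 must come before GB4/GB5 so that CR x LF never breaks.
  if (prev === 'CR' && next === 'LF') return false;
  // GB4/GB5: break around controls otherwise.
  if (prev === 'Control' || prev === 'CR' || prev === 'LF') return true;
  if (next === 'Control' || next === 'CR' || next === 'LF') return true;
  // ...remaining rules (Hangul, emoji sequences, regional indicators) follow.
  return true;
}

The guard on prev is exactly the kind of detail that makes the reordered version less intuitive to read.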

Lessons on using LLM agents

  • An LLM can do basic optimization, but it doesn't help much on an already optimized path.
  • Suggestions may be incorrect or based on insufficient evidence.
  • Because no benchmark is perfect, be cautious about making major changes based on one.
  • It's still your responsibility to interpret and verify the details.
  • They rarely give helpful hints; it's up to you.
